The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice and the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the prospect of prize money played only a minor role (16%). Although a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
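As an illustration of the patch-based strategy mentioned above, the following is a minimal sketch of tiling a 2D image that is too large to process at once. The function names, the list-of-lists image representation, and the choice to add one extra patch at the border are assumptions made for this example; the survey does not describe any particular implementation.

```python
def patch_origins(length, patch, stride):
    """Start indices of patches along one axis of the given length."""
    if patch >= length:
        return [0]  # the whole axis fits into a single patch
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        # Assumption: add one overlapping patch so the border is covered.
        starts.append(length - patch)
    return starts


def extract_patches(image, patch_h, patch_w, stride_h, stride_w):
    """Cut a 2D image (list of rows) into patches in row-major order."""
    height, width = len(image), len(image[0])
    patches = []
    for r in patch_origins(height, patch_h, stride_h):
        for c in patch_origins(width, patch_w, stride_w):
            patches.append([row[c:c + patch_w] for row in image[r:r + patch_h]])
    return patches


# Hypothetical 5x5 image whose pixel value encodes its position.
image = [[r * 5 + c for c in range(5)] for r in range(5)]
patches = extract_patches(image, 2, 2, 2, 2)
```

With a stride smaller than the patch size, neighbouring patches overlap, which is one common way to reduce border effects when the per-patch predictions are later merged.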
The choice of hyper-parameter values in sparse Bayesian learning (SBL) can significantly impact performance. However, the hyper-parameters are normally tuned manually, which is often a difficult task. Most recently, effective automatic hyper-parameter tuning was achieved by using an empirical auto-tuner. In this work, we address the issue of hyper-parameter auto-tuning using neural network (NN)-based learning. Inspired by the empirical auto-tuner, we design and learn an NN-based auto-tuner, and show that considerable improvement in convergence rate and recovery performance can be achieved.
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with the above NPU, demonstrating rates of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
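To make the idea of an INT8 model concrete, here is a minimal sketch of symmetric per-tensor quantization of a list of floating-point weights. It is only an illustration of the general technique: the abstract does not state which quantization scheme or toolchain the challenge used, and the function names and the clamping range of [-127, 127] are assumptions of this example.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Assumption: symmetric scheme with zero-point 0; all-zero input uses scale 1.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [-1.0, 0.0, 0.25, 1.0]
q_weights, q_scale = quantize_int8(weights)
restored = dequantize(q_weights, q_scale)
```

The rounding error per value is at most half a quantization step (`q_scale / 2`), which is why the dynamic range of the weights matters for the accuracy of a quantized model.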
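We present EasyRec, an easy-to-use, extensible, and efficient recommendation framework for building industrial recommender systems. EasyRec is superior in the following respects: first, it adopts a modular and pluggable design pattern to reduce the effort of building customized models; second, it implements hyper-parameter optimization and feature selection algorithms to improve model performance automatically; third, it applies online learning to adapt quickly to ever-changing data distributions. The code is released at https://github.com/alibaba/easyrec.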
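Time-lapse photography is used in movies and promotional films because it can convey the passage of time in a short clip and enhance visual appeal. However, because it takes a long time and requires stable shooting, it is a great challenge for photographers. In this paper, we propose a time-lapse photography system with virtual and real robots. To help users shoot time-lapse videos efficiently, we first parameterize time-lapse photography and propose a parameter optimization method. For different parameters, different aesthetic models, including image and video aesthetic quality assessment networks, are used to generate optimal parameters. We then propose a time-lapse photography interface that lets users view and adjust the parameters and use virtual robots to carry out virtual photography in a three-dimensional scene. The system can also export the parameters and provide them to real robots so that time-lapse videos can be shot in the real world. In addition, we propose a time-lapse photography aesthetic assessment method that automatically evaluates the aesthetic quality of time-lapse videos. Experimental results show that our method can obtain time-lapse videos efficiently. We also conducted a user study. The results show that our system achieves an effect similar to that of professional photographers while being more efficient.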
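Spatiotemporal activity prediction, which aims to predict user activities at a specific location and time, is crucial for applications such as urban planning and mobile advertising. Existing solutions based on tensor decomposition or graph embedding suffer from two major limitations: 1) they ignore the fine-grained similarities of user preferences; 2) their user modeling is entangled. In this work, we propose a hypergraph neural network model called DisenHCN to bridge these gaps. In particular, we first unify the fine-grained user similarities and the complex matching between user preferences and spatiotemporal activities into a heterogeneous hypergraph. We then disentangle the user representations into different aspects (location-aware, time-aware, and activity-aware) and aggregate the corresponding aspect's features on the constructed hypergraph, which captures high-order relations from different aspects and disentangles the impact of each aspect on the final prediction. Extensive experiments show that DisenHCN outperforms state-of-the-art methods by 14.23% to 18.10% on four real-world datasets. Further studies also convincingly verify the rationality of each component in DisenHCN.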
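Segmenting dental plaque from images stained with a medical reagent provides valuable information for diagnosis and for determining a follow-up treatment plan. However, accurate dental plaque segmentation is a challenging task: it requires identifying teeth and dental plaque in the presence of semantically ambiguous regions (i.e., confusing boundaries in the border regions between teeth and dental plaque) and complex variations in instance shape, neither of which is fully addressed by existing methods. We therefore propose a semantic decomposition network (SDNET) that introduces two single-task branches to address the segmentation of teeth and of dental plaque separately, and that designs additional constraints to learn category-specific features for each branch, thereby facilitating the semantic decomposition and improving segmentation performance. Specifically, SDNET learns two separate segmentation branches for teeth and dental plaque in a divide-and-conquer manner to decouple the entangled relation between them. Each branch, dedicated to one category, tends to yield accurate segmentation. To help the two branches focus better on category-specific features, two constraint modules are further proposed: 1) a constraint module that learns discriminative feature representations by maximizing the distance between the representations of different categories, so as to reduce the negative impact of semantically ambiguous regions on feature extraction; 2) a structural constraint module (SCM) that provides complete structural information for dental plaque of various shapes through the supervision of a boundary-aware geometric constraint. In addition, we construct a large-scale open-source stained dental plaque segmentation dataset (SDPSEG), which provides high-quality annotations for teeth and dental plaque. Experimental results on the SDPSEG dataset show that SDNET achieves state-of-the-art performance.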
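In this work, we introduce the Gradient Siamese Network (GSN) for image quality assessment. The proposed method is adept at capturing the gradient features between distorted and reference images in the full-reference image quality assessment (IQA) task. We use central difference convolution to obtain both the semantic features and the detail differences hidden in an image pair. Furthermore, spatial attention guides the network to concentrate on regions related to image detail. For the low-level, mid-level, and high-level features extracted by the network, we design a novel multi-level fusion method to improve the efficiency of feature utilization. In addition to the common root-mean-square error supervision, we further consider the relative distance among the samples in a batch and successfully apply a KL divergence loss to the image quality assessment task. We evaluated the proposed GSN algorithm on several publicly available datasets and demonstrated its superior performance. Our network won second place in the NTIRE 2022 Perceptual Image Quality Assessment Challenge (track 1).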
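For readers unfamiliar with central difference convolution, the sketch below shows one commonly cited formulation: the ordinary convolution response minus `theta` times the centre pixel multiplied by the sum of the kernel weights. This is an assumption about the general technique, not a description of the GSN implementation; the abstract gives no formula, and the function name, the `theta` default, and the "valid" (no padding) border handling are choices made for this example.

```python
def central_difference_conv2d(image, kernel, theta=0.7):
    """Valid-mode 2D central difference convolution on lists of rows.

    output = sum(w * x) - theta * x_center * sum(w)
    theta = 0 gives an ordinary convolution (cross-correlation);
    theta = 1 responds only to differences from the centre pixel.
    """
    kh, kw = len(kernel), len(kernel[0])
    height, width = len(image), len(image[0])
    kernel_sum = sum(sum(row) for row in kernel)
    output = []
    for r in range(height - kh + 1):
        out_row = []
        for c in range(width - kw + 1):
            vanilla = sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            center = image[r + kh // 2][c + kw // 2]
            out_row.append(vanilla - theta * center * kernel_sum)
        output.append(out_row)
    return output


# Hypothetical inputs: a flat 4x4 image and a 3x3 kernel of ones.
flat_image = [[2.0] * 4 for _ in range(4)]
ones_kernel = [[1.0] * 3 for _ in range(3)]
pure_difference = central_difference_conv2d(flat_image, ones_kernel, theta=1.0)
plain_conv = central_difference_conv2d(flat_image, ones_kernel, theta=0.0)
```

On a perfectly flat image the `theta=1.0` response vanishes, which illustrates why such an operator emphasizes gradient-like detail rather than absolute intensity.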
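Face anti-spoofing research is widely applied in face recognition and has received increasing attention from industry and academia. In this paper, we propose Eulernet, a new temporal feature fusion network in which a differential filter and a residual pyramid are used, respectively, to extract and amplify abnormal cues from consecutive frames. A lightweight sample labeling method based on facial landmarks is designed to label large-scale samples at lower cost, and it yields better results than other methods such as 3D cameras. Finally, we collect 30,000 live and spoofing samples using various mobile devices to create a dataset that replicates various forms of attack in real-world settings. Extensive experiments on the public Oulu-NPU dataset show that our algorithm outperforms the state of the art, and our solution has already been deployed in the real world, serving millions of users.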
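Self-supervised learning (SSL) has achieved promising downstream performance. However, when facing the various resource budgets of real-world applications, pretraining multiple networks of different sizes one by one imposes a huge computational burden. In this paper, we propose discriminative-SSL-based slimmable pretrained networks (DSPNET), which can be trained once and then slimmed to multiple sub-networks of various sizes, each of which faithfully learns a good representation and can serve as a good initialization for downstream tasks with various resource budgets. Specifically, we extend the idea of slimmable networks to the discriminative SSL paradigm by gracefully integrating SSL and knowledge distillation. On ImageNet, under the linear evaluation and semi-supervised evaluation protocols, DSPNET shows comparable or improved performance relative to networks pretrained individually one by one, while greatly reducing training cost. The pretrained models also generalize well to downstream detection and segmentation tasks. Code will be made public.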